Practical Sparse Matrices in C++ with Hybrid Storage and Template-Based Expression Optimisation
Despite the importance of sparse matrices in numerous fields of science,
software implementations remain difficult to use for non-expert users,
generally requiring the understanding of underlying details of the chosen
sparse matrix storage format. In addition, to achieve good performance, several
formats may need to be used in one program, requiring explicit selection and
conversion between the formats. This can be both tedious and error-prone,
especially for non-expert users. Motivated by these issues, we present a
user-friendly and open-source sparse matrix class for the C++ language, with a
high-level application programming interface deliberately similar to the widely
used MATLAB language. This facilitates prototyping directly in C++ and aids the
conversion of research code into production environments. The class internally
uses two main approaches to achieve efficient execution: (i) a hybrid storage
framework, which automatically and seamlessly switches between three underlying
storage formats (compressed sparse column, Red-Black tree, coordinate list)
depending on which format is best suited and/or available for specific
operations, and (ii) a template-based meta-programming framework to
automatically detect and optimise execution of common expression patterns.
Empirical evaluations on large sparse matrices with various densities of
non-zero elements demonstrate the advantages of the hybrid storage framework
and the expression optimisation mechanism.
Comment: extended and revised version of an earlier conference paper
arXiv:1805.0338
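To make the idea of switching between storage formats concrete, the following is a minimal sketch of one such conversion: coordinate list (triplets) to compressed sparse column. It is not the class described in the abstract; the function names `coo_to_csc` and `csc_matvec` are hypothetical, and the sketch assumes that no coordinate appears twice in the input.

```python
from typing import List, Tuple


def coo_to_csc(n_cols: int, entries: List[Tuple[int, int, float]]):
    """Convert (row, col, value) triplets into compressed sparse column arrays.

    Assumption for this sketch: each (row, col) pair appears at most once.
    """
    # Sort by column, then by row, so each column's entries are contiguous.
    ordered = sorted(entries, key=lambda e: (e[1], e[0]))
    values = [v for _, _, v in ordered]
    row_indices = [r for r, _, _ in ordered]

    # Count entries per column, then turn the counts into running offsets.
    # Afterwards col_ptr[j] .. col_ptr[j + 1] delimits column j.
    col_ptr = [0] * (n_cols + 1)
    for _, c, _ in ordered:
        col_ptr[c + 1] += 1
    for j in range(n_cols):
        col_ptr[j + 1] += col_ptr[j]

    return values, row_indices, col_ptr


def csc_matvec(n_rows: int, values, row_indices, col_ptr, x):
    """Multiply a CSC matrix by a dense vector x."""
    y = [0.0] * n_rows
    for j in range(len(col_ptr) - 1):
        for k in range(col_ptr[j], col_ptr[j + 1]):
            y[row_indices[k]] += values[k] * x[j]
    return y


# 3x3 matrix [[1, 2, 0], [0, 0, 3], [4, 0, 0]] given as unordered triplets.
triplets = [(0, 0, 1.0), (2, 0, 4.0), (1, 2, 3.0), (0, 1, 2.0)]
values, row_indices, col_ptr = coo_to_csc(3, triplets)
product = csc_matvec(3, values, row_indices, col_ptr, [1.0, 1.0, 1.0])
```

The triplet form is convenient for inserting elements in any order, while the column-compressed form suits column-wise traversal such as the matrix-vector product above, which is one reason a program may want both.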
An Open Source C++ Implementation of Multi-Threaded Gaussian Mixture Models, k-Means and Expectation Maximisation
Modelling of multivariate densities is a core component in many signal
processing, pattern recognition and machine learning applications. The
modelling is often done via Gaussian mixture models (GMMs), which use
computationally expensive and potentially unstable training algorithms. We
provide an overview of a fast and robust implementation of GMMs in the C++
language, employing multi-threaded versions of the Expectation Maximisation
(EM) and k-means training algorithms. Multi-threading is achieved through
reformulation of the EM and k-means algorithms into a MapReduce-like framework.
Furthermore, the implementation uses several techniques to improve numerical
stability and modelling accuracy. We demonstrate that the multi-threaded
implementation achieves a speedup of an order of magnitude on a recent 16 core
machine, and that it can achieve higher modelling accuracy than a previously
well-established publicly accessible implementation. The multi-threaded
implementation is included as a user-friendly class in recent releases of the
open source Armadillo C++ linear algebra library. The library is provided under
the permissive Apache 2.0 license, allowing unencumbered use in commercial
products.
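The MapReduce-like reformulation mentioned in the abstract can be illustrated with one k-means iteration. This is a sketch under stated assumptions, not the Armadillo implementation: the names `map_chunk`, `reduce_partials` and `kmeans_iteration` are hypothetical, and the chunks are processed sequentially here, although each map call is independent and could be handed to its own thread.

```python
def nearest(point, means):
    """Index of the mean closest to point (squared Euclidean distance)."""
    best, best_dist = 0, float("inf")
    for i, mean in enumerate(means):
        dist = sum((p - q) ** 2 for p, q in zip(point, mean))
        if dist < best_dist:
            best, best_dist = i, dist
    return best


def map_chunk(chunk, means):
    """Map step: per-cluster sums and counts for one chunk of the data."""
    dim = len(means[0])
    sums = [[0.0] * dim for _ in means]
    counts = [0] * len(means)
    for point in chunk:
        k = nearest(point, means)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += point[d]
    return sums, counts


def reduce_partials(partials, means):
    """Reduce step: combine the partial statistics into updated means."""
    dim = len(means[0])
    total_sums = [[0.0] * dim for _ in means]
    total_counts = [0] * len(means)
    for sums, counts in partials:
        for k in range(len(means)):
            total_counts[k] += counts[k]
            for d in range(dim):
                total_sums[k][d] += sums[k][d]
    new_means = []
    for k, old_mean in enumerate(means):
        if total_counts[k] == 0:
            new_means.append(list(old_mean))  # empty cluster: keep old mean
        else:
            new_means.append([s / total_counts[k] for s in total_sums[k]])
    return new_means


def kmeans_iteration(data, means, n_chunks=4):
    """One k-means update computed from independent per-chunk statistics."""
    size = max(1, (len(data) + n_chunks - 1) // n_chunks)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    partials = [map_chunk(chunk, means) for chunk in chunks]
    return reduce_partials(partials, means)


data = [[0.0], [1.0], [10.0], [11.0]]
updated = kmeans_iteration(data, [[0.0], [10.0]], n_chunks=2)
```

Because the reduce step only adds sums and counts, the result does not depend on how the data is split into chunks, apart from floating-point rounding order.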
Armadillo: An Open Source C++ Linear Algebra Library for Fast Prototyping and Computationally Intensive Experiments
In this report we provide an overview of the open source Armadillo C++ linear algebra library (matrix maths). The library aims to have a good balance between speed and ease of use, and is useful if C++ is the language of choice (due to speed and/or integration capabilities), rather than another language like Matlab or Octave. In particular, Armadillo can be used for fast prototyping and computationally intensive experiments, while at the same time allowing for relatively painless transition of research code into production environments. It is distributed under a license that is applicable in both open source and proprietary software development contexts. The library supports integer, floating point and complex numbers, as well as a subset of trigonometric and statistics functions. Various matrix decompositions are provided through optional integration with LAPACK, or one of its high-performance drop-in replacements, such as MKL from Intel or ACML from AMD. A delayed evaluation approach is employed (during compile time) to combine several operations into one and reduce (or eliminate) the need for temporaries. This is accomplished through C++ template meta-programming. Performance comparisons suggest that the library is considerably faster than Matlab and Octave, as well as previous C++ libraries such as IT++ and Newmat. This report reflects a subset of the functionality present in Armadillo 0.9.92
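The delayed evaluation idea can be sketched as a small runtime analogue. Armadillo resolves such expressions at compile time through C++ templates; the Python classes below (`Vec`, `Sum`, `evaluate`) are hypothetical names that only illustrate the principle of recording an expression first and computing it in a single pass, without intermediate vectors.

```python
class Expr:
    """Base class: adding two expressions records the operation only."""

    def __add__(self, other):
        return Sum(self, other)


class Vec(Expr):
    """A concrete vector holding its elements."""

    def __init__(self, data):
        self.data = list(data)

    def __len__(self):
        return len(self.data)

    def at(self, i):
        return self.data[i]


class Sum(Expr):
    """A deferred element-wise sum; nothing is computed on construction."""

    def __init__(self, left, right):
        if len(left) != len(right):
            raise ValueError("operands must have the same length")
        self.left, self.right = left, right

    def __len__(self):
        return len(self.left)

    def at(self, i):
        # Element i is computed on demand from the operands.
        return self.left.at(i) + self.right.at(i)


def evaluate(expr):
    """Materialise an expression in one pass over the element indices."""
    return Vec(expr.at(i) for i in range(len(expr)))


a, b, c = Vec([1, 2]), Vec([10, 20]), Vec([100, 200])
deferred = a + b + c          # builds Sum(Sum(a, b), c), no arithmetic yet
result = evaluate(deferred)   # single loop, no temporary for a + b
```

An eager implementation of `a + b + c` would first allocate a temporary for `a + b` and then a second vector for the final result; the deferred form allocates only the final one.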
On Robust Face Recognition via Sparse Encoding: the Good, the Bad, and the Ugly
In the field of face recognition, Sparse Representation (SR) has received
considerable attention during the past few years. Most of the relevant
literature focuses on holistic descriptors in closed-set identification
applications. The underlying assumption in SR-based methods is that each class
in the gallery has sufficient samples and the query lies on the subspace
spanned by the gallery of the same class. Unfortunately, such an assumption is
easily violated in the more challenging face verification scenario, where an
algorithm is required to determine if two faces (where one or both have not
been seen before) belong to the same person. In this paper, we first discuss
why previous attempts with SR might not be applicable to verification problems.
We then propose an alternative approach to face verification via SR.
Specifically, we propose to use explicit SR encoding on local image patches
rather than the entire face. The obtained sparse signals are pooled via
averaging to form multiple region descriptors, which are then concatenated to
form an overall face descriptor. Due to the deliberate loss of spatial relations
within each region (caused by averaging), the resulting descriptor is robust to
misalignment and various image deformations. Within the proposed framework, we
evaluate several SR encoding techniques: l1-minimisation, Sparse Autoencoder
Neural Network (SANN), and an implicit probabilistic technique based on
Gaussian Mixture Models. Thorough experiments on AR, FERET, exYaleB, BANCA and
ChokePoint datasets show that the proposed local SR approach obtains
considerably better and more robust performance than several previous
state-of-the-art holistic SR methods, in both verification and closed-set
identification problems. The experiments also show that l1-minimisation based
encoding has a considerably higher computational cost than the other techniques, but
leads to higher recognition rates.
Modelling Local Deep Convolutional Neural Network Features to Improve Fine-Grained Image Classification
We propose a local modelling approach using deep convolutional neural
networks (CNNs) for fine-grained image classification. Recently, deep CNNs
trained from large datasets have considerably improved the performance of
object recognition. However, to date there has been limited work using these
deep CNNs as local feature extractors. This partly stems from CNNs having
internal representations which are high dimensional, thereby making such
representations difficult to model using stochastic models. To overcome this
issue, we propose to reduce the dimensionality of one of the internal fully
connected layers, in conjunction with layer-restricted retraining to avoid
retraining the entire network. The distribution of low-dimensional features
obtained from the modified layer is then modelled using a Gaussian mixture
model. Comparative experiments show that considerable performance improvements
can be achieved on the challenging Fish and UEC FOOD-100 datasets.
Comment: 5 pages, three figures
Subset Feature Learning for Fine-Grained Category Classification
Fine-grained categorisation has been a challenging problem due to small
inter-class variation, large intra-class variation and a low number of training
images. We propose a learning system which first clusters visually similar
classes and then learns deep convolutional neural network features specific to
each subset. Experiments on the popular fine-grained Caltech-UCSD bird dataset
show that the proposed method outperforms recent fine-grained categorisation
methods under the most difficult setting: no bounding boxes are presented at
test time. It achieves a mean accuracy of 77.5%, compared to the previous best
performance of 73.2%. We also show that progressive transfer learning allows us
to first learn domain-generic features (for bird classification) which can then
be adapted to a specific set of bird classes, yielding improvements in accuracy.
Sparse Coding on Symmetric Positive Definite Manifolds using Bregman Divergences
This paper introduces sparse coding and dictionary learning for Symmetric
Positive Definite (SPD) matrices, which are often used in machine learning,
computer vision and related areas. Unlike traditional sparse coding schemes
that work in vector spaces, in this paper we discuss how SPD matrices can be
described by a sparse combination of dictionary atoms, where the atoms are also
SPD matrices. We propose to seek sparse coding by embedding the space of SPD
matrices into Hilbert spaces through two types of Bregman matrix divergences.
This not only leads to an efficient way of performing sparse coding, but also
an online and iterative scheme for dictionary learning. We apply the proposed
methods to several computer vision tasks where images are represented by region
covariance matrices. Our proposed algorithms outperform state-of-the-art
methods on a wide range of classification tasks, including face recognition,
action recognition, material classification and texture categorization.
Multi-Action Recognition via Stochastic Modelling of Optical Flow and Gradients
In this paper we propose a novel approach to multi-action recognition that
performs joint segmentation and classification. This approach models each
action with a Gaussian mixture built on robust low-dimensional action features.
Segmentation is achieved by performing classification on overlapping temporal
windows, which are then merged to produce the final result. This approach is
considerably less complicated than previous methods which use dynamic
programming or computationally expensive hidden Markov models (HMMs). Initial
experiments on a stitched version of the KTH dataset show that the proposed
approach achieves an accuracy of 78.3%, outperforming a recent HMM-based
approach which obtained 71.2%.
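The segmentation-by-classification idea can be sketched as follows. The abstract does not say how the overlapping windows are merged, so this sketch assumes a per-frame majority vote; the function name `segment_by_windows`, its parameters and the toy classifier are all hypothetical.

```python
from collections import Counter


def segment_by_windows(frames, classify, window=3, step=1):
    """Label overlapping windows, then merge them by per-frame voting.

    Assumption: merging is a majority vote per frame. `classify` maps a
    list of frames to a label and stands in for a trained model.
    Returns a list of (start, end, label) segments with `end` exclusive.
    """
    votes = [Counter() for _ in frames]
    last_start = max(0, len(frames) - window)
    for start in range(0, last_start + 1, step):
        stop = min(start + window, len(frames))
        label = classify(frames[start:stop])
        for t in range(start, stop):
            votes[t][label] += 1

    # Pick the most voted label per frame; ties go to the label that
    # received its first vote earliest. Frames without votes get None.
    labels = [
        max(v.items(), key=lambda kv: kv[1])[0] if v else None
        for v in votes
    ]

    # Merge runs of identical frame labels into segments.
    segments = []
    for t, label in enumerate(labels):
        if segments and segments[-1][2] == label:
            segments[-1] = (segments[-1][0], t + 1, label)
        else:
            segments.append((t, t + 1, label))
    return segments


# Toy input: each frame is already a 0/1 feature, and the stand-in
# classifier returns the majority value within a window of three frames.
toy_frames = [0] * 5 + [1] * 5
toy_segments = segment_by_windows(
    toy_frames, lambda w: 1 if sum(w) >= 2 else 0, window=3, step=1
)
```

Compared with dynamic programming or an HMM, this needs only independent window classifications followed by a linear merge pass.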
Matching Image Sets via Adaptive Multi Convex Hull
Traditional nearest points methods use all the samples in an image set to
construct a single convex or affine hull model for classification. However,
strong artificial features and noisy data may be generated from combinations of
training samples when significant intra-class variations and/or noise occur in
the image set. Existing multi-model approaches extract local models by
clustering each image set individually only once, with fixed clusters used for
matching with various image sets. This may not be optimal for discrimination,
as undesirable environmental conditions (e.g. illumination and pose variations)
may result in the two closest clusters representing different characteristics
of an object (e.g. frontal face being compared to non-frontal face). To address
the above problem, we propose a novel approach to enhance nearest points based
methods by integrating affine/convex hull classification with an adapted
multi-model approach. We first extract multiple local convex hulls from a query
image set via maximum margin clustering to diminish the artificial variations
and constrain the noise in local convex hulls. We then propose adaptive
reference clustering (ARC) to constrain the clustering of each gallery image
set by forcing the clusters to have resemblance to the clusters in the query
image set. By applying ARC, noisy clusters in the query set can be discarded.
Experiments on Honda, MoBo and ETH-80 datasets show that the proposed method
outperforms single model approaches and other recent techniques, such as Sparse
Approximated Nearest Points, Mutual Subspace Method and Manifold Discriminant
Analysis.
Comment: IEEE Winter Conference on Applications of Computer Vision (WACV),
201
Bags of Affine Subspaces for Robust Object Tracking
We propose an adaptive tracking algorithm where the object is modelled as a
continuously updated bag of affine subspaces, with each subspace constructed
from the object's appearance over several consecutive frames. In contrast to
linear subspaces, affine subspaces explicitly model the origin of subspaces.
Furthermore, instead of using a brittle point-to-subspace distance during the
search for the object in a new frame, we propose to use a subspace-to-subspace
distance by representing candidate image areas also as affine subspaces.
Distances between subspaces are then obtained by exploiting the non-Euclidean
geometry of Grassmann manifolds. Experiments on challenging videos (containing
object occlusions, deformations, as well as variations in pose and
illumination) indicate that the proposed method achieves higher tracking
accuracy than several recent discriminative trackers.
Comment: in International Conference on Digital Image Computing: Techniques
and Applications, 201